Message-Passing Decoding of Lattices Using Gaussian Mixtures



Brian M. Kurkoski* 

Abstract — A lattice decoder which represents messages 
explicitly as a mixture of Gaussian functions is given. In
order to prevent the number of functions in a mixture from 
growing as the decoder iterations progress, a method for 
replacing N Gaussian functions with M Gaussian functions, 
with M < N, is given. A squared distance metric is used to 
select functions for combining. A pair of selected Gaussians 
is replaced by a single Gaussian with the same first and 
second moments. The metric can be computed efficiently and,
at the same time, the proposed algorithm empirically gives
good results: for example, a dimension-100 lattice has a
loss of 0.2 dB in signal-to-noise ratio at a probability of
symbol error of $10^{-5}$.
1 Introduction

Lattices play a central role in many communication problems. While Shannon used a non-lattice, and non-constructive, Euclidean-space code to compute the capacity of the AWGN channel, recently Erez and Zamir showed that lattice encoding and decoding can also achieve the capacity of the AWGN channel [1]. Similarly, for the problem of communication with known noise, which has applications to multiuser communications and information hiding, lattice codes play an important role [2]. In source coding, lattices may be used for lossy compression of a real-valued source.

To approach theoretical capacities, it is necessary to let the dimension of the lattice or code become asymptotically large. However, for most lattices of interest, the decoder complexity is worse than linear in the dimension, and most studied lattices have small dimension. For example, a frequently cited reference on lattice decoding gives experimental results with a maximum dimension of 45 [3]. Other approaches use trellis-based lattices, which are exponentially complex in the number of states [4]. Historically, finite-field error-correcting codes suffered the same complexity limitation. However, with the advent of iteratively-decoded low-density parity check codes and turbo codes, the theoretical capacity of some binary-input communication channels can be achieved [5].

Recently, a new lattice construction and decoding algorithm, based upon the ideas of low-density parity check codes, has been introduced. So-called low-density lattice codes (LDLC) are lattices defined by a sparse inverse generator matrix with a pseudo-random construction. Decoding is performed iteratively using message-passing, and complexity is linear in the block length. Sommer, Feder and Shalvi, who proposed this lattice and decoder, demonstrated decoding with dimension as high as $10^6$. However, the experiments considered decoding only for a special communications problem where the transmit power is unconstrained. Comments in their paper suggest that the algorithm did not converge when applied to the more important problem of general lattice decoding [6] [7].

When decoding on the AWGN channel, the LDLC decoder messages are continuous-valued functions, which can be exactly represented by a mixture of Gaussian functions. However, as iterations progress, the number of Gaussians in the mixture grows rapidly. A direct implementation of a decoder which exploits this property is infeasible, and so prior works quantize the messages, ignoring the Gaussian nature of the messages.

In this paper, the LDLC decoder messages are rep- 
resented as Gaussian functions, and the growth in the 
number of Gaussians is reduced by a proposed Gaussian 
mixture reduction algorithm. This algorithm approxi- 
mates a number of Gaussians N with a smaller number 
of Gaussians M. The algorithm combines Gaussians 
in a pair-wise fashion iteratively until a stopping con- 
dition is reached. A distance metric, which computes 
the squared difference between a pair of Gaussian func- 
tions, and the single Gaussian which has the same first 
and second moments, is used. 

Section 2 gives a review of the construction and decoding algorithm for low-density lattice codes. If the channel noise is Gaussian, then messages in the decoding algorithm can be represented as a mixture of Gaussian functions. Section 3 gives a method for replacing a pair of Gaussians with a single Gaussian, which is applied to an algorithm which reduces a mixture of $N$ Gaussian functions to a mixture of $M$ Gaussians. Section 4 applies this algorithm to the decoding of low-density lattice codes, and considers simulation results. Section 5 is the conclusion.

2 Low-density Lattice Codes 

2.1 Lattices and Lattice Communication 

A lattice is a regular infinite array of points in $\mathbb{R}^n$.
Definition An $n$-dimensional lattice $\Lambda$ is the set of points $x = (x_1, x_2, \ldots, x_n)$ with

$$x = Gb, \tag{1}$$

where $G$ is an $n$-by-$n$ generator matrix and $b = (b_1, \ldots, b_n)$ ranges over the set of all possible integer vectors, $b_i \in \mathbb{Z}$.

The following communications system is considered. Let the codeword $x$ be an arbitrary point of the lattice $\Lambda$. This codeword is transmitted over an AWGN channel with known noise variance $\sigma^2$, and received as the sequence $y = \{y_1, y_2, \ldots, y_n\}$:

$$y_i = x_i + z_i, \quad i = 1, 2, \ldots, n, \tag{2}$$

where $z_i$ is the AWGN. A maximum-likelihood decoder selects $\hat{x}$ as the estimated codeword:

$$\hat{x} = \arg\max_{x \in \Lambda} \Pr(y \mid x). \tag{3}$$

The received codeword is correct if $\hat{x} = x$ and incorrect otherwise. The power of the transmitted symbol, $\|x\|^2$, is unbounded. Instead, power is restricted by the volume of the Voronoi region, $|\det(G)|$.

For this system, Poltyrev [8] showed that for sufficiently large $n$, there exists a lattice for which the probability of error becomes arbitrarily small, if and only if,

$$\sigma^2 < \frac{|\det(G)|^{2/n}}{2\pi e}. \tag{4}$$
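
As a numerical illustration of (4), the following minimal sketch computes the noise-variance limit for a given generator matrix and the gap to that limit in dB. The function names `poltyrev_noise_limit` and `distance_from_limit_db` are hypothetical, NumPy is assumed, and the cubic lattice is used only as a toy input.

```python
import math
import numpy as np

def poltyrev_noise_limit(G):
    """Right-hand side of (4): |det(G)|^(2/n) / (2*pi*e)."""
    n = G.shape[0]
    # slogdet returns log|det(G)|, which avoids overflow for large n.
    _, logabsdet = np.linalg.slogdet(G)
    return math.exp(2.0 * logabsdet / n) / (2.0 * math.pi * math.e)

def distance_from_limit_db(G, sigma2):
    """Gap in dB between the limit of (4) and the noise variance sigma2."""
    return 10.0 * math.log10(poltyrev_noise_limit(G) / sigma2)

G = np.eye(4)                       # cubic lattice Z^4, det(G) = 1
limit = poltyrev_noise_limit(G)     # equals 1 / (2*pi*e), about 0.0585
```

A positive value of `distance_from_limit_db` means the noise variance is below the limit, which is the regime in which reliable decoding is possible in principle.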



Poltyrev's result is in contrast to Shannon's theorem that the capacity of the Gaussian channel, subject to a transmission power constraint, is $\frac{1}{2}\log(1 + \mathrm{SNR})$. To achieve capacity while observing the power constraint, the codepoints are on the surface of an $n$-sphere with high probability.

2.2 LDLC Definition 

Definition A low-density lattice code is a lattice with a non-singular generator matrix $G$, for which $H = G^{-1}$ is sparse.

Regular LDLC's have $H$ matrices with constant row and column weight $d$. Although not necessary, it is convenient to assume that $\det(H) = 1/\det(G) = 1$. The non-zero entries are selected pseudo-randomly.

In a magic square LDLC, the absolute values of the $d$ non-zero entries in each row and each column are drawn from the set $\{h_1, h_2, \ldots, h_d\}$ with $h_1 \ge h_2 \ge \cdots \ge h_d > 0$. The signs of the entries of $H$ are pseudo-randomly changed to minus with probability 0.5. From here, $(n, d)$ magic square LDLC's are considered with $h_1 = 1$, and $h_i = 1/\sqrt{d}$ for $i = 2, \ldots, d$. Such codes resulted in only slightly worse performance than other weight sequences [7].

2.3 LDLC Decoding 

The LDLC decoding algorithm is based upon belief-propagation, where messages are real functions corresponding to probability distributions on the symbols $x_i$. As with decoding low-density parity check codes, the decoding algorithm may be presented on a bipartite graph. There are $nd$ variable-to-check messages $q_k(z)$, and $nd$ check-to-variable messages $r_k(z)$, $k = 1, 2, \ldots, nd$.



With an AWGN channel, the initial message is:

$$q_k(z) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(y_i - z)^2}{2\sigma^2}} \tag{5}$$

for the edge $k$ connected to variable node $i$.
2.3.1 Check Node 

For the check node, note that (1) can be rewritten as

$$Hx = b, \tag{6}$$

which defines a sparse system of equations:

$$\sum_{k \in \mathcal{I}_i} h_k x_k = b_i \tag{7}$$

for $i = 1, 2, \ldots, n$, with $b_i \in \mathbb{Z}$, where $\mathcal{I}_i$ is the set of columns of $H$ which have a non-zero entry in row $i$.



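Let $\tilde{x}_k = h_k x_k$, so $\sum_{k=1}^{d} \tilde{x}_k = b$, where $b$ is an integer. The input and output messages are $q_k(z)$ and $r_k(z)$, respectively, for $k = 1, 2, \ldots, d$. From (7), for an arbitrary $i$,

$$x_k = \frac{b - (h_1 x_1 + \cdots + h_{k-1} x_{k-1} + h_{k+1} x_{k+1} + \cdots + h_d x_d)}{h_k}$$

or,

$$x_k = \frac{1}{h_k}\Big(b - \sum_{i=1,\, i \ne k}^{d} \tilde{x}_i\Big). \tag{8}$$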



The output message $r_k(z)$ can be obtained from the input messages $q_i(z)$, $i = 1, \ldots, d$, $i \ne k$, in four steps: Unstretch, Convolution, Extension and Stretch.

Unstretch is multiplication by $h_k$. The message for $\tilde{x}_k$ is $\tilde{q}_k(z)$,

$$\tilde{q}_k(z) = q_k\Big(\frac{z}{h_k}\Big). \tag{9}$$

Convolution The message for $\sum_{i=1,\, i \ne k}^{d} \tilde{x}_i$ is $\tilde{r}_k(z)$. The distribution of the sum of random variables is the convolution of distributions,

$$\tilde{r}_k(z) = (\tilde{q}_1 * \cdots * \tilde{q}_{k-1} * \tilde{q}_{k+1} * \cdots * \tilde{q}_d)(z), \tag{10}$$

where $*$ denotes real-number convolution.

Extension is a shift-and-repeat operation for the unknown integer $b$. Conditioned on a specific value of $b$, the distribution of $b - \sum_{i=1,\, i \ne k}^{d} \tilde{x}_i$ is $\tilde{r}_k(b - z)$. Assuming that $b$ is an arbitrary integer with uniform a priori distribution,

$$r'_k(z) = \sum_{b=-\infty}^{\infty} \tilde{r}_k(b - z). \tag{11}$$

Stretching is multiplication by $1/h_k$. Finally the message $r_k(z)$, which is the message for $x_k$, is obtained as:

$$r_k(z) = r'_k(h_k z). \tag{12}$$



Note that the above operations are linear and can 
be interchanged as is required for an implementation. 



2.3.2 Variable Node 

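At variable node $i$, take the product of incoming messages, and normalize.
Product:

$$\hat{q}_k(z) = e^{-\frac{(y_i - z)^2}{2\sigma^2}} \prod_{j=1,\, j \ne k}^{d} r_j(z). \tag{13}$$

Normalize:

$$q_k(z) = \frac{\hat{q}_k(z)}{\int_{-\infty}^{\infty} \hat{q}_k(z)\, dz}. \tag{14}$$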



2.3.3 Estimated Codeword and Integer Sequence

The check node and variable node operations are repeated iteratively until a stopping condition is reached. Estimate the transmitted codeword $\hat{x}$ by first computing the a posteriori message $F_i(z)$ for the code symbol $x_i$ as:

$$F_i(z) = e^{-\frac{(y_i - z)^2}{2\sigma^2}} \prod_{k=1}^{d} r_k(z). \tag{15}$$

Find $\hat{x}_i$ as:

$$\hat{x}_i = \arg\max_{z} F_i(z). \tag{16}$$

The estimated integer sequence $\hat{b}$ is:

$$\hat{b} = \lfloor H\hat{x} \rceil, \tag{17}$$

where $\lfloor z \rceil$ denotes the integer closest to $z$.
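
The rounding step (17) can be sketched as follows. This is only an illustration under stated assumptions: the 2-by-2 matrix `H` is a hypothetical toy inverse generator matrix, not an LDLC from the paper, and `estimate_integers` is a name introduced here. Note that `numpy.rint` rounds exact ties to the nearest even integer.

```python
import numpy as np

def estimate_integers(H, x_hat):
    """Return b_hat = round(H @ x_hat), the integer vector of (17)."""
    return np.rint(H @ x_hat).astype(int)

# Toy example with a hypothetical 2x2 inverse generator matrix H.
H = np.array([[1.0, 0.5],
              [-0.5, 1.0]])
G = np.linalg.inv(H)
b = np.array([3, -2])
x = G @ b                               # lattice point x = G b, as in (1)
x_hat = x + np.array([0.01, -0.02])     # slightly perturbed symbol estimates
b_hat = estimate_integers(H, x_hat)
```

As long as every entry of $H(\hat{x} - x)$ has magnitude below 1/2, the rounding recovers the transmitted integer vector.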

2.4 Gaussian Mixture Decoder 

When the channel noise is Gaussian, all of the 
LDLC messages can be described as a mixture of Gaus- 
sian functions. From here, "Gaussians" will be used as 
shorthand for "Gaussian functions" . 

In this section, it is assumed that a message $f(z)$ is a mixture of $N$ Gaussians,

$$f(z) = \sum_{i=1}^{N} c_i\, \mathcal{N}(z; m_i, v_i), \tag{18}$$

where $c_i \ge 0$ are the mixing coefficients with $\sum_{i=1}^{N} c_i = 1$, and

$$\mathcal{N}(z; m, v) = \frac{1}{\sqrt{2\pi v}} e^{-\frac{(z - m)^2}{2v}}. \tag{19}$$

In this way, the message $f(z)$ can be described by a list of triples of means, variances and mixing coefficients, $\{(m_1, v_1, c_1), \ldots, (m_N, v_N, c_N)\}$.

In describing the Gaussian mixture decoder, initially assume that the input messages to a node consist of a single Gaussian, that is $N = 1$.
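
The list-of-triples representation can be illustrated with a short sketch. The helper names `gaussian` and `mixture` and the example message are assumptions made here for illustration; triples are ordered (mean, variance, mixing coefficient) as in the text.

```python
import math

def gaussian(z, m, v):
    """N(z; m, v) of (19): Gaussian density with mean m and variance v."""
    return math.exp(-(z - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

def mixture(z, triples):
    """Evaluate f(z) of (18) from a list of (mean, variance, coefficient)."""
    return sum(c * gaussian(z, m, v) for (m, v, c) in triples)

f = [(-1.0, 0.5, 0.3), (2.0, 1.0, 0.7)]    # example message with N = 2
value = mixture(0.0, f)
```

Storing a message then costs three numbers per component, regardless of how finely the function would otherwise have to be sampled.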

Check node Without loss of generality, consider check node inputs $k = 1, 2, \ldots, d-1$ and output $d$. Each input message $q_k(z)$ is a single Gaussian $\mathcal{N}(z; m_k, v_k)$.

The message $\tilde{q}_k(z)$ is obtained by multiplying by $h_k$, so $\tilde{q}_k(z) = \mathcal{N}(z; h_k m_k, h_k^2 v_k)$.

The message $\tilde{r}_d(z)$ is the convolution of $\tilde{q}_k(z)$, $k = 1, \ldots, d-1$. So:

$$\tilde{r}_d(z) = \mathcal{N}\Big(z; \sum_{k=1}^{d-1} h_k m_k, \sum_{k=1}^{d-1} h_k^2 v_k\Big). \tag{20}$$

The message $r'_d(z)$ is $\tilde{r}_d(z)$ shifted over all possible integers:

$$r'_d(z) = \sum_{b=-\infty}^{\infty} \mathcal{N}\Big(z; \sum_{k=1}^{d-1} h_k m_k + b, \sum_{k=1}^{d-1} h_k^2 v_k\Big).$$

The output message $r_d(z)$ is obtained by scaling by $-1/h_d$, so:

$$r_d(z) = \sum_{b=-\infty}^{\infty} \mathcal{N}\Big(z; -\frac{\sum_{k=1}^{d-1} h_k m_k + b}{h_d}, \frac{\sum_{k=1}^{d-1} h_k^2 v_k}{h_d^2}\Big).$$
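
The single-Gaussian check node computation can be sketched as below. Two assumptions are made here that go beyond the text: the infinite sum over $b$ is truncated to a finite window `shifts`, and the kept components are given equal weights so that the result is a normalized mixture. The function name `check_node_output` is hypothetical.

```python
def check_node_output(inputs, h, h_d, shifts):
    """Check node output toward edge d for single-Gaussian inputs.

    inputs : list of (m_k, v_k) for k = 1..d-1
    h      : list of the matching coefficients h_k
    h_d    : coefficient on the output edge
    shifts : iterable of integers b (a finite window, an assumption here)
    Returns a list of (mean, variance, weight) triples.
    """
    # Unstretch and convolve: scaled means and variances add, as in (20).
    mean = sum(hk * m for hk, (m, _) in zip(h, inputs))
    var = sum(hk ** 2 * v for hk, (_, v) in zip(h, inputs))
    shifts = list(shifts)
    weight = 1.0 / len(shifts)      # equal weights over the kept shifts
    # Extension and stretch: shift by b, then scale by -1/h_d.
    return [(-(mean + b) / h_d, var / h_d ** 2, weight) for b in shifts]

out = check_node_output([(0.2, 0.1), (-0.4, 0.2)], h=[1.0, 0.5], h_d=0.5,
                        shifts=range(-1, 2))
```

In this example the convolved Gaussian has mean 0 and variance 0.15, so the three output components are centered at 2, 0 and -2, each with variance 0.6.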



Variable Node. Let the check-to-variable node messages $r_k(z)$, $k = 1, \ldots, d-1$, be Gaussians $\mathcal{N}(z; m_k, v_k)$. For notational convenience, let $m_0 = y_i$ be the symbol received from the channel at node $i$ and let $v_0 = \sigma^2$ be the channel variance, as in (5). The output message $q_d(z)$, the product of these input messages, will also be a Gaussian,

$$q_d(z) = k_d\, \mathcal{N}(z; m_d, v_d),$$

where,

$$\frac{1}{v_d} = \sum_{k=0}^{d-1} \frac{1}{v_k}, \tag{21}$$

$$\frac{m_d}{v_d} = \sum_{k=0}^{d-1} \frac{m_k}{v_k}, \tag{22}$$

and,

$$k_d = \sqrt{\frac{v_d}{(2\pi)^{d-1} \prod_{k=0}^{d-1} v_k}}\, \exp\Big(-\frac{v_d}{2} \sum_{i=0}^{d-2} \sum_{j=i+1}^{d-1} \frac{(m_i - m_j)^2}{v_i v_j}\Big). \tag{23}$$
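
The product of single Gaussians at the variable node can be checked numerically with the sketch below. It assumes the standard precision-weighted form of the product of Gaussian densities for $v_d$, $m_d$ and the scale factor $k_d$; the function name `variable_node_output` and the example messages are hypothetical.

```python
import math

def variable_node_output(inputs):
    """Product of single Gaussians N(z; m_k, v_k), k = 0..d-1.

    Returns (k_d, m_d, v_d) such that the product of the inputs equals
    k_d * N(z; m_d, v_d).
    """
    v_d = 1.0 / sum(1.0 / v for (_, v) in inputs)
    m_d = v_d * sum(m / v for (m, v) in inputs)
    n = len(inputs)
    # Pairwise squared mean differences, weighted by both variances.
    pair = sum((inputs[i][0] - inputs[j][0]) ** 2
               / (inputs[i][1] * inputs[j][1])
               for i in range(n) for j in range(i + 1, n))
    prod_v = math.prod(v for (_, v) in inputs)
    k_d = (math.sqrt(v_d / ((2.0 * math.pi) ** (n - 1) * prod_v))
           * math.exp(-0.5 * v_d * pair))
    return k_d, m_d, v_d

def gaussian(z, m, v):
    return math.exp(-(z - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

msgs = [(0.1, 0.5), (0.3, 1.0), (-0.2, 2.0)]    # (m_0, v_0) is the channel
k_d, m_d, v_d = variable_node_output(msgs)
```

Since the message is normalized afterwards, only the relative values of $k_d$ across the components of a mixture matter in the decoder.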



For the general case where the input consists of a 
mixture of Gaussians, at either the check node or the 
variable node, the output can be found by conditioning 
on one element from each input mixture and comput- 
ing a single output Gaussian. The mixing coefficient for 
this Gaussian is the product of the input mixing coef- 
ficients. Then the output is the mixture of these single 
Gaussians created by conditioning all input combina- 
tions. 
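
For the check node convolution, the conditioning argument above can be sketched as follows: every pair of components, one from each input mixture, yields one output Gaussian whose mean and variance are the sums of the inputs' and whose coefficient is the product. The function name `convolve_mixtures` and the example mixtures are assumptions for illustration.

```python
from itertools import product

def convolve_mixtures(f, g):
    """Convolution of two Gaussian mixtures given as (m, v, c) triples.

    Conditioning on one component of each input gives one output Gaussian:
    means add, variances add, and the mixing coefficients multiply.
    """
    return [(m1 + m2, v1 + v2, c1 * c2)
            for (m1, v1, c1), (m2, v2, c2) in product(f, g)]

f = [(0.0, 0.1, 0.5), (1.0, 0.1, 0.5)]
g = [(-1.0, 0.2, 0.25), (0.0, 0.2, 0.5), (1.0, 0.2, 0.25)]
h = convolve_mixtures(f, g)     # 2 * 3 = 6 components
```

The output size is the product of the input sizes, which is the growth that motivates mixture reduction.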

The number of Gaussians in each mixture grows rapidly as the iterations progress. At the variable node, if input $k$ consists of a mixture of $N_k$ Gaussian functions, then the output message will consist of $N_1 N_2 \cdots N_{d-1}$ Gaussian functions. At the check node, even if the number of integer shifts is bounded, the number of Gaussian functions in the mixture also grows as $O(N^{d-1})$. A naive implementation of this Gaussian mixture decoder is prohibitively complex. The following section proposes a technique for approximating a large number of Gaussians.

3 Gaussian Mixture Reduction 

This section describes an algorithm which approxi- 
mates a mixture of Gaussian functions with a smaller 
number of Gaussian functions. 

The algorithm input is a mixture of $N$ Gaussians, $f(z)$, as defined in (18), given as a list of triples. The algorithm output is a list of $M$ triples of means, variances and mixing coefficients, $\{(m^r_1, v^r_1, c^r_1), \ldots, (m^r_M, v^r_M, c^r_M)\}$ with $\sum_{i=1}^{M} c^r_i = 1$, that similarly forms a Gaussian mixture $g(z)$. With $M < N$, the output mixture should be a good approximation of the input mixture:

$$f(z) \approx g(z) = \sum_{i=1}^{M} c^r_i\, \mathcal{N}(z; m^r_i, v^r_i). \tag{24}$$

First, a metric which describes the error due to replacing two Gaussians with a single Gaussian is given. Then, this is incorporated into a greedy search algorithm which replaces $N$ Gaussians with $M$ Gaussians.

3.1 Approximating a Mixture of Two Gaussians with a Single Gaussian

Definition The squared difference $\mathrm{SD}(p\|q)$ between two distributions $p(z)$ and $q(z)$ with support $Z$ is defined as:

$$\mathrm{SD}(p\|q) = \int_{z \in Z} (p(z) - q(z))^2\, dz. \tag{25}$$

Lemma The squared difference $\mathrm{SD}(p\|q)$ has the following properties:

• $\mathrm{SD}(p\|q) \ge 0$ for any distributions $p$ and $q$.

• $\mathrm{SD}(p\|q) = 0$ if and only if $p = q$.

• $\mathrm{SD}(p\|q) = \mathrm{SD}(q\|p)$.

Lemma The squared difference between the Gaussian distributions $\mathcal{N}(m_1, v_1)$ and $\mathcal{N}(m_2, v_2)$ is given by

$$\mathrm{SD}(\mathcal{N}(m_1, v_1) \| \mathcal{N}(m_2, v_2)) = \frac{1}{2\sqrt{\pi v_1}} + \frac{1}{2\sqrt{\pi v_2}} - \frac{2}{\sqrt{2\pi(v_1 + v_2)}} e^{-\frac{(m_1 - m_2)^2}{2(v_1 + v_2)}}. \tag{26}$$
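
The closed form for two Gaussians can be compared with a direct numerical integration of (25). The sketch below assumes the standard identities $\int \mathcal{N}(z; m, v)^2 dz = 1/(2\sqrt{\pi v})$ and $\int \mathcal{N}(z; m_1, v_1)\mathcal{N}(z; m_2, v_2) dz = \mathcal{N}(m_1; m_2, v_1 + v_2)$; the function names and the integration grid are hypothetical choices.

```python
import math

def sd_gaussians(m1, v1, m2, v2):
    """Closed-form squared difference between two Gaussians, as in (26)."""
    cross = (math.exp(-(m1 - m2) ** 2 / (2.0 * (v1 + v2)))
             / math.sqrt(2.0 * math.pi * (v1 + v2)))
    return (1.0 / (2.0 * math.sqrt(math.pi * v1))
            + 1.0 / (2.0 * math.sqrt(math.pi * v2))
            - 2.0 * cross)

def sd_numeric(m1, v1, m2, v2, lo=-20.0, hi=20.0, steps=40000):
    """Riemann-sum evaluation of (25), for illustration only."""
    def n(z, m, v):
        return math.exp(-(z - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
    dz = (hi - lo) / steps
    return sum((n(lo + i * dz, m1, v1) - n(lo + i * dz, m2, v2)) ** 2
               for i in range(steps)) * dz
```

The closed form needs one exponential and three square roots, which is what makes the metric cheap to evaluate for every candidate pair.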

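Lemma The squared difference between a single Gaussian $\mathcal{N}(m, v)$ and a mixture of two Gaussians $c_1 \mathcal{N}(m_1, v_1) + c_2 \mathcal{N}(m_2, v_2)$, with $c_1 + c_2 = 1$, is:

$$\frac{1}{2\sqrt{\pi v}} + \frac{c_1^2}{2\sqrt{\pi v_1}} + \frac{c_2^2}{2\sqrt{\pi v_2}} - \frac{2c_1}{\sqrt{2\pi(v + v_1)}} e^{-\frac{(m - m_1)^2}{2(v + v_1)}} - \frac{2c_2}{\sqrt{2\pi(v + v_2)}} e^{-\frac{(m - m_2)^2}{2(v + v_2)}} + \frac{2c_1 c_2}{\sqrt{2\pi(v_1 + v_2)}} e^{-\frac{(m_1 - m_2)^2}{2(v_1 + v_2)}}. \tag{27}$$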



There is unfortunately no closed-form expression for the minimal squared difference in the previous lemma. However, minimizing the Kullback-Leibler divergence between the single Gaussian distribution and the mixture of two Gaussian distributions is tractable; it simply amounts to moment matching. Therefore, from now on we will consider the moment-matched Gaussian approximation.

Lemma The mean $m$ and variance $v$ of a mixture of two Gaussian distributions $c_1 \mathcal{N}(m_1, v_1) + c_2 \mathcal{N}(m_2, v_2)$ are given by:

$$m = c_1 m_1 + c_2 m_2, \tag{28}$$

$$v = c_1(m_1^2 + v_1) + c_2(m_2^2 + v_2) - c_1^2 m_1^2 - 2 c_1 c_2 m_1 m_2 - c_2^2 m_2^2. \tag{29}$$
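
Moment matching of a pair of triples can be sketched as below. The last three terms of (29) are written compactly as $-m^2$, which is the same quantity. One assumption goes beyond the text: the merged triple is given the weight $c_1 + c_2$ so that the total weight of a larger mixture is preserved when the pair is replaced. The name `moment_match` is hypothetical.

```python
def moment_match(t1, t2):
    """Merge two (m, v, c) triples into one with matching moments.

    The coefficients are normalized internally, so c1 + c2 need not be one.
    The merged triple carries weight c1 + c2 (an assumption that keeps the
    total mixture weight unchanged).
    """
    (m1, v1, c1), (m2, v2, c2) = t1, t2
    w1, w2 = c1 / (c1 + c2), c2 / (c1 + c2)
    m = w1 * m1 + w2 * m2                                    # (28)
    v = w1 * (m1 ** 2 + v1) + w2 * (m2 ** 2 + v2) - m ** 2   # (29)
    return (m, v, c1 + c2)

merged = moment_match((0.0, 1.0, 0.2), (2.0, 1.0, 0.2))
```

Here two unit-variance Gaussians at 0 and 2 with equal weights merge into a Gaussian with mean 1 and variance 2: the component variance plus the spread of the means.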

Let $t_i$, $i = 1, 2$, denote the triple $(m_i, v_i, c_i)$, where $c_1 + c_2$ is not necessarily one, and let the normalized triple be $\bar{t}_i = (m_i, v_i, c_i/(c_1 + c_2))$. The single Gaussian which satisfies the property of the Lemma is denoted as:

$$t = \mathrm{MM}(t_1, t_2), \tag{30}$$

where $t = (m, v, 1)$, with $m$ and $v$ as given in (28) and (29).

Definition The Gaussian quadratic loss $\mathrm{GQL}(p)$ of a probability distribution $p$ is defined as the squared difference between $p$ and the Gaussian distribution with the same mean $m$ and variance $v$ as $p$:

$$\mathrm{GQL}(p) = \mathrm{SD}(p \| \mathcal{N}(m, v)). \tag{31}$$



Corollary The Gaussian quadratic loss of a mixture of two Gaussian distributions,

$$\mathrm{GQL}(t_1, t_2) = \mathrm{SD}(c_1 \mathcal{N}(m_1, v_1) + c_2 \mathcal{N}(m_2, v_2) \| \mathcal{N}(m, v)),$$

is obtained by evaluating (27), with $m$ and $v$ as given in (28) and (29).
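
A minimal sketch of the Gaussian quadratic loss of a pair follows. It assumes that the coefficients are normalized before the squared difference is evaluated, and it builds every term from one helper for the integral of a product of two Gaussians; `_overlap` and `gql` are hypothetical names.

```python
import math

def _overlap(m1, v1, m2, v2):
    """Integral of N(z; m1, v1) * N(z; m2, v2) over z."""
    s = v1 + v2
    return math.exp(-(m1 - m2) ** 2 / (2.0 * s)) / math.sqrt(2.0 * math.pi * s)

def gql(t1, t2):
    """Gaussian quadratic loss of the mixture of t1 and t2."""
    (m1, v1, c1), (m2, v2, c2) = t1, t2
    c1, c2 = c1 / (c1 + c2), c2 / (c1 + c2)     # normalized coefficients
    m = c1 * m1 + c2 * m2                       # moment-matched mean
    v = c1 * (m1 ** 2 + v1) + c2 * (m2 ** 2 + v2) - m ** 2
    return (_overlap(m, v, m, v)
            + c1 ** 2 * _overlap(m1, v1, m1, v1)
            + c2 ** 2 * _overlap(m2, v2, m2, v2)
            - 2.0 * c1 * _overlap(m, v, m1, v1)
            - 2.0 * c2 * _overlap(m, v, m2, v2)
            + 2.0 * c1 * c2 * _overlap(m1, v1, m2, v2))
```

The loss is zero when the two components coincide and grows as their means separate, so a small value indicates a pair that can be merged with little distortion.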



3.2 Approximating N Gaussians with M Gaussians

Here, we use the results from the previous subsection and propose an algorithm which approximates a mixture of $N$ Gaussians with a mixture of $M$ Gaussians.

Input: list $\mathcal{L} = \{t_1, t_2, \ldots, t_N\}$ of $N$ triples describing a Gaussian mixture, and two stopping parameters: $\theta$, the allowable one-step error (measured by GQL), and $M$, the maximum number of allowable Gaussians in the output.
Algorithm

1. Initialize the current search list, $\mathcal{C}$, with the input list: $\mathcal{C} \leftarrow \mathcal{L}$.

2. Initialize the current error, $\theta_c$, to the minimum GQL between all pairs of Gaussians:

$$\theta_c = \min_{t_i, t_j \in \mathcal{C},\, i \ne j} \mathrm{GQL}(t_i, t_j).$$

3. Initialize the length of the current list, $M_c = N$.

4. While $\theta_c < \theta$ or $M_c > M$:

(a) Determine the pair of Gaussians with the smallest GQL:

$$(t_i, t_j) = \arg\min_{t_i, t_j \in \mathcal{C},\, i \ne j} \mathrm{GQL}(t_i, t_j).$$

(b) Add the single Gaussian with the same moments as $t_i$ and $t_j$ to the list: $\mathcal{C} \leftarrow \mathcal{C} \cup \mathrm{MM}(t_i, t_j)$.

(c) Delete $t_i$ and $t_j$ from the list: $\mathcal{C} \leftarrow \mathcal{C} \setminus \{t_i, t_j\}$.

(d) Recalculate the minimum GQL:

$$\theta_c = \min_{t_i, t_j \in \mathcal{C},\, i \ne j} \mathrm{GQL}(t_i, t_j).$$

(e) Decrement the current list length: $M_c \leftarrow M_c - 1$.

5. Algorithm output: list of triples $\mathcal{C}$.

Note that both conditions must be satisfied for the algorithm to stop. That is, the one-step error may be greater than the threshold $\theta$ if the number of Gaussians still exceeds $M$. On the other hand, the number of output Gaussians may be less than $M$ if the one-step error is sufficiently low.
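
The greedy reduction can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it recomputes the GQL of every pair in each pass instead of updating it incrementally, it stops when a single Gaussian remains, and the merged triple is given weight $c_1 + c_2$ (an assumption). All function names are hypothetical.

```python
import math
from itertools import combinations

def _overlap(m1, v1, m2, v2):
    """Integral of N(z; m1, v1) * N(z; m2, v2) over z."""
    s = v1 + v2
    return math.exp(-(m1 - m2) ** 2 / (2.0 * s)) / math.sqrt(2.0 * math.pi * s)

def moment_match(t1, t2):
    (m1, v1, c1), (m2, v2, c2) = t1, t2
    w1, w2 = c1 / (c1 + c2), c2 / (c1 + c2)
    m = w1 * m1 + w2 * m2
    v = w1 * (m1 ** 2 + v1) + w2 * (m2 ** 2 + v2) - m ** 2
    return (m, v, c1 + c2)          # weight c1 + c2 is an assumption

def gql(t1, t2):
    (m1, v1, c1), (m2, v2, c2) = t1, t2
    c1, c2 = c1 / (c1 + c2), c2 / (c1 + c2)
    m, v, _ = moment_match(t1, t2)
    return (_overlap(m, v, m, v)
            + c1 ** 2 * _overlap(m1, v1, m1, v1)
            + c2 ** 2 * _overlap(m2, v2, m2, v2)
            - 2.0 * c1 * _overlap(m, v, m1, v1)
            - 2.0 * c2 * _overlap(m, v, m2, v2)
            + 2.0 * c1 * c2 * _overlap(m1, v1, m2, v2))

def best_pair(current):
    """Return (minimum GQL, i, j) over all pairs in the current list."""
    return min((gql(current[i], current[j]), i, j)
               for i, j in combinations(range(len(current)), 2))

def gmr(triples, theta, max_m):
    """Greedy Gaussian mixture reduction with stopping parameters."""
    current = list(triples)                         # step 1
    while len(current) > 1:                         # no pair left otherwise
        theta_c, i, j = best_pair(current)          # steps 2, 4(a), 4(d)
        if not (theta_c < theta or len(current) > max_m):
            break                                   # both conditions met
        merged = moment_match(current[i], current[j])
        current = [t for k, t in enumerate(current) if k not in (i, j)]
        current.append(merged)                      # steps 4(b), 4(c)
    return current

mix = [(0.0, 0.1, 0.25), (0.01, 0.1, 0.25), (5.0, 0.1, 0.25), (5.02, 0.1, 0.25)]
reduced = gmr(mix, theta=1e-3, max_m=3)
```

In the example the two tight clusters are each merged because their one-step error is below the threshold, while the two resulting Gaussians are far apart and are kept separate.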

4 Gaussian-Mixture Reduction Applied to LDLC Decoding

In this section, the Gaussian mixture reduction algorithm of Section 3 is applied to the LDLC decoding algorithm described in Section 2.4.

At the check node, observe that the message $\tilde{r}_k(z)$, as given in (10), can be computed recursively with $a_k(z)$ and $b_k(z)$ defined as:

$$a_1(z) = \tilde{q}_1(z), \tag{32}$$

$$a_k(z) = a_{k-1}(z) * \tilde{q}_k(z), \quad k = 2, \ldots, d-1, \tag{33}$$

and,

$$b_d(z) = \tilde{q}_d(z), \tag{34}$$

$$b_k(z) = b_{k+1}(z) * \tilde{q}_k(z), \quad k = d-1, \ldots, 2. \tag{35}$$

Then $\tilde{r}_k(z)$ is found using a variation on the forward-backward algorithm as:

$$\tilde{r}_1(z) = b_2(z), \tag{36}$$

$$\tilde{r}_k(z) = a_{k-1}(z) * b_{k+1}(z), \quad k = 2, 3, \ldots, d-1, \text{ and,} \tag{37}$$

$$\tilde{r}_d(z) = a_{d-1}(z). \tag{38}$$

The Gaussian mixture reduction algorithm is applied after the computation (33) and (35), for each $k$. For example, if $\hat{a}_k(z)$ is the mixture produced by applying the Gaussian mixture reduction algorithm to $a_k(z)$,

$$\hat{a}_k(z) = \mathrm{GMR}(a_k(z)), \tag{39}$$

then the forward recursion of the check node function may be stated as:

$$\hat{a}_1(z) = \tilde{q}_1(z), \tag{40}$$

For $k = 2, 3, \ldots, d-1$:

$$a_k(z) = \hat{a}_{k-1}(z) * \tilde{q}_k(z), \tag{41}$$

$$\hat{a}_k(z) = \mathrm{GMR}(a_k(z)), \tag{42}$$

and similarly for the backward recursion.

Similarly at the variable node, the product (13) can be decomposed into a forward and backward recursion. In this case as well, the Gaussian mixture reduction algorithm is applied after each step of the recursion.
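
The forward-backward structure of the check node recursion can be sketched as follows. The sketch uses 0-based list indices, takes the mixture reduction as a caller-supplied function `reducer` (the identity by default, which is a placeholder and not the reduction algorithm itself), and assumes at least two input mixtures. All names are hypothetical.

```python
from itertools import product

def convolve(f, g):
    """Convolve two Gaussian mixtures given as lists of (m, v, c) triples."""
    return [(m1 + m2, v1 + v2, c1 * c2)
            for (m1, v1, c1), (m2, v2, c2) in product(f, g)]

def forward_backward(q, reducer=lambda mix: mix):
    """Return [r_1, ..., r_d], where r_k combines every q_i with i != k.

    q       : list of d mixtures (lists of (m, v, c) triples), d >= 2
    reducer : mixture reduction applied after each recursion step
    """
    d = len(q)
    a = [q[0]]                                  # forward recursion
    for k in range(1, d - 1):
        a.append(reducer(convolve(a[-1], q[k])))
    b = [q[d - 1]]                              # backward recursion
    for k in range(d - 2, 0, -1):
        b.insert(0, reducer(convolve(b[0], q[k])))
    # Now a[i] covers q[0..i] and b[i] covers q[i+1..d-1].
    r = [b[0]]                                  # output for the first edge
    for k in range(1, d - 1):
        r.append(convolve(a[k - 1], b[k]))      # interior edges
    r.append(a[d - 2])                          # output for the last edge
    return r

q = [[(1.0, 0.1, 1.0)], [(2.0, 0.2, 1.0)], [(3.0, 0.3, 1.0)], [(4.0, 0.4, 1.0)]]
r = forward_backward(q)
```

Each partial result is shared by several outputs, so all $d$ outputs are obtained with a number of convolutions linear in $d$, and the reduction keeps every intermediate mixture small.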

In the Gaussian mixture reduction algorithm, it is desirable to repeat step 4 as long as the current reduced Gaussian function $g(z)$ (represented by the current list $\mathcal{C}$) remains a good approximation of the input function $f(z)$ (represented by the input list $\mathcal{L}$). In practice, it was found that using a "local" stopping condition of a threshold on the one-step error was sufficient to give a good "global" approximation $f(z) \approx g(z)$. In many cases, $f(z)$ was well-approximated by a single Gaussian, which was found by the proposed algorithm.

However, using an error threshold alone does not always restrict the number of output Gaussians, an important goal of the mixture reduction algorithm. Thus, a second stopping condition, which requires that the number of Gaussians be lower than some fixed threshold, is also enforced. Thus, the Gaussian combining may continue while $M_c > M$, even if the one-step error threshold has been exceeded. In practice, this did not appear to have a detrimental result for a wide range of symbol-error rates.

Simulation results comparing the proposed decoder with the quantized decoder [7] are shown in Fig. 1. An LDLC with $n = 100$, $d = 5$ was used. The symbol error rate of a cubic lattice used for transmission is labeled "Uncoded." The horizontal axis is the difference between the channel noise variance and the Poltyrev capacity, $1/(2\pi e)$, in dB.

For the parameter selection $\theta = 0.5, 1.0$, and $M \le 6$, it was found that the proposed algorithm performed with a slight performance loss when the probability of symbol error was greater than $10^{-5}$. For example, with $\theta = 0.5$ and $M = 6$, the loss at a symbol error rate of $10^{-5}$ is less than 0.1 dB. For lower symbol error rates, an error floor appears. It may be helpful to consider this error floor as analogous to quantization error floors which appear in the decoding of low-density parity check codes when insufficient quantization levels are used.

Complexity In the Gaussian mixture reduction algorithm, the primary complexity is computing the initial error, which requires computing the GQL between all pairs of the $N$ Gaussians, a complexity of $O(N^2)$. In the Gaussian mixture decoder, the primary complexity is the pairwise computation of the outputs, which is $O(M^2)$. These numbers $N$ and $M$ are random variables which depend upon the nature of the messages, and the effectiveness of the Gaussian mixture reduction algorithm. In the simulations the maximum value of $M$ was 6, and $N \le kM^2$, where $k$ is the constant number of integer shifts; $k = 3$ was used in the simulations. On the other hand, the complexity of the quantized algorithm is dominated by a discrete Fourier transform of size $1/\Delta$, where $\Delta$ is the quantization bin width; $\Delta = 1/128$ was used in the simulations. It is difficult to directly compare the computational complexity of the two algorithms.

Figure 1: Symbol error rate of proposed Gaussian mixture (GM) decoder vs. quantized decoder for n = 100, d = 5.

The memory required for the proposed algorithm, however, is significantly lower. The proposed algorithm requires storage of $3M$ values (the mean, variance and mixing coefficient of each Gaussian) for each message, where $M \le 6$. The quantized algorithm, however, used 1024 quantization points for each message.

5 Conclusion 

LDLC codes can be used for communication over unconstrained power channels. In this paper, we proposed a new LDLC decoding algorithm which exploits the Gaussian nature of the decoder messages. The core of the algorithm is a Gaussian mixture reduction method, which approximates a message by a smaller number of Gaussians. As a result, an LDLC decoding algorithm which tracks the means, variances and mixing coefficients of the component Gaussians, rather than using quantized messages, was tractable. It was shown by computer simulation that this algorithm performs nearly as well as the quantized algorithm when the dimension is $n = 100$ and the probability of symbol error is greater than $10^{-5}$.